ADF Data Package Developer's Guide v1.5.3 RF

The Allotrope Data Format (ADF) [[!ADF]] consists of several APIs and taxonomies. The ADF Data Package API [[!ADF-DP]] defines an interface for storing files and folder structures and thus provides one of the most essential operations of the ADF. This document provides a Developer's Guide for the Allotrope Data Format Data Package API.

Disclaimer

THESE MATERIALS ARE PROVIDED "AS IS" AND ALLOTROPE EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE WARRANTIES OF NON-INFRINGEMENT, TITLE, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

This document is part of a set of specifications on the Allotrope Data Format [[!ADF]].

Introduction

The Allotrope Data Format (ADF) defines an interface for experimental data generated in analytical laboratory processes. It is intended for data exchange, long-term preservation and fast real-time data access. The ADF Data Package (ADF-DP) API defines an interface for storing files and folder structures and thus provides one of the most essential operations of the ADF.

The following sections explain the operations on files and folders by means of examples. A complete code example is referenced towards the end of this document.

Document Conventions

Namespaces

Within this specification, the following namespace prefix bindings are used:

Prefix    Namespace
owl:      http://www.w3.org/2002/07/owl#
rdf:      http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:     http://www.w3.org/2000/01/rdf-schema#
xsd:      http://www.w3.org/2001/XMLSchema#
skos:     http://www.w3.org/2004/02/skos/core#
dct:      http://purl.org/dc/terms/
foaf:     http://xmlns.com/foaf/0.1/
adf-dp:   http://purl.allotrope.org/ontologies/datapackage#
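The prefix bindings above can be used to expand a prefixed name such as adf-dp:File into a full IRI. The following is a minimal sketch of that expansion in plain Java. The class name PrefixExpander and the method expand are illustrative assumptions and are not part of the ADF API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (not part of the ADF API): expands prefixed names
// using the prefix bindings listed in this section.
class PrefixExpander {
    static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("owl", "http://www.w3.org/2002/07/owl#");
        PREFIXES.put("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#");
        PREFIXES.put("rdfs", "http://www.w3.org/2000/01/rdf-schema#");
        PREFIXES.put("xsd", "http://www.w3.org/2001/XMLSchema#");
        PREFIXES.put("skos", "http://www.w3.org/2004/02/skos/core#");
        PREFIXES.put("dct", "http://purl.org/dc/terms/");
        PREFIXES.put("foaf", "http://xmlns.com/foaf/0.1/");
        PREFIXES.put("adf-dp", "http://purl.allotrope.org/ontologies/datapackage#");
    }

    // Expands "prefix:local" to a full IRI; throws if the prefix is unknown.
    static String expand(String prefixedName) {
        int colon = prefixedName.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a prefixed name: " + prefixedName);
        }
        String namespace = PREFIXES.get(prefixedName.substring(0, colon));
        if (namespace == null) {
            throw new IllegalArgumentException("Unknown prefix in: " + prefixedName);
        }
        return namespace + prefixedName.substring(colon + 1);
    }

    public static void main(String[] args) {
        // prints "http://purl.allotrope.org/ontologies/datapackage#File"
        System.out.println(expand("adf-dp:File"));
        // prints "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
        System.out.println(expand("rdf:type"));
    }
}
```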

Indication of Requirement Levels

Within this document the definitions of MUST, SHOULD and MAY are used as defined in [[!rfc2119]].

Number Formatting

Within this document, decimal numbers will use a dot "." as the decimal mark.

ADF Data Package Operations

This section describes the operations that can be carried out with the Data Package API. First, the operations on folders are described, then the operations on files.

The terminology that is used below is defined in [[!ADF-DPO]].

Accessing the Data Package

The Data Package API provides a method that offers access to the Data Package contained in a given ADF file. The method is invoked on the opened ADF file and returns a reference to the Data Package contained in that file.

JAVA:
// Retrieves the "Data Package" of the opened ADF file.
DataPackage dataPackage = adfFile.getDataPackage();

C#:
// Retrieves the "Data Package" of the opened ADF file.
DataPackage dataPackage = adfFile.getDataPackage();
			

Folder Operations

The following sections describe the operations that can be carried out on folders with the Data Package API.

Creating a Folder

The Data Package API provides a method to create a folder inside another folder of the Data Package. The method takes the name of the new folder as argument and returns a reference to the new folder. The method throws an exception if a folder or file with this name already exists in the target folder. Further, the name of the folder MUST fulfil the interoperability requirements described in [[!ADF-DP]]; otherwise an exception is thrown.

The following example shows the creation of a folder:

JAVA:
// Retrieves the Data Package of the opened file.
DataPackage dataPackage = adfFile.getDataPackage();

// Retrieves the root folder of the DataPackage.
DpFolder rootFolder = dataPackage.openRootFolder();

// Creates a new folder "myFolder"
DpFolder newFolder = rootFolder.createFolder("myFolder");

C#:
// Retrieves the Data Package of the opened file.
DataPackage dataPackage = adfFile.getDataPackage();

// Retrieves the root folder of the DataPackage.
DpFolder rootFolder = dataPackage.openRootFolder();

// Creates a new folder "myFolder"
DpFolder newFolder = rootFolder.createFolder("myFolder");
				

The meta-data created in the triple store look like this:

<urn:uuid:6eca6e3e-0e88-4de1-a048-334e124b0f6d>
      a       adf-dp:Folder , ldp:Container ;
      dc:title "myFolder"^^xsd:string ;
      adf-dp:representedBy <hdf:/data-package> .

<hdf:/data-package>
      a       hdf:Group ;
      hdf:name "data-package"^^xsd:string .
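The rule that a folder name must be unique within its parent can be illustrated without the ADF library. The following is a minimal in-memory sketch. The class SketchFolder and the exception types it throws are hypothetical stand-ins, not the ADF DpFolder class. The sketch models folders only, and the empty-name check is merely a placeholder for the interoperability requirements defined in [[!ADF-DP]].

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a Data Package folder (not the ADF DpFolder class).
class SketchFolder {
    private final String name;
    private final Map<String, SketchFolder> children = new LinkedHashMap<>();

    SketchFolder(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    // Creates a child folder and rejects names that are already taken in this folder.
    SketchFolder createFolder(String childName) {
        if (childName == null || childName.isEmpty()) {
            // Placeholder check only; the real naming rules are defined in ADF-DP.
            throw new IllegalArgumentException("Folder name must not be empty");
        }
        if (children.containsKey(childName)) {
            throw new IllegalStateException("Name already exists: " + childName);
        }
        SketchFolder child = new SketchFolder(childName);
        children.put(childName, child);
        return child;
    }

    public static void main(String[] args) {
        SketchFolder root = new SketchFolder("");
        root.createFolder("myFolder");
        try {
            root.createFolder("myFolder");
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage()); // prints "Rejected: Name already exists: myFolder"
        }
    }
}
```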
				

Accessing a Folder

The ADF-DP API provides a method to access a given folder in a Data Package. The folder MAY be given by URI, by absolute path or by relative path. The method takes the path or the URI of the folder as argument and returns a reference to this folder if it exists.

The method throws an exception if the folder given by URI or by path does not exist. The folder returned by this method MUST fulfil the requirements for accessing a folder specified in [[!ADF-DP]].

Accessing a Folder by URI

JAVA:
DpFolder folder = dataPackage.getFolderByURI("urn:uuid:1123123-123123...");

C#:
DpFolder folder = dataPackage.getFolderByURI("urn:uuid:1123123-123123...");
				

Accessing a Folder by path

JAVA:
DpFolder folder = dataPackage.getFolderByPath("/a/b/c");

C#:
DpFolder folder = dataPackage.getFolderByPath("/a/b/c");
				

Accessing a Folder by browsing

JAVA:
DpFolder folder = dataPackage.openRootFolder().openFolder("a").openFolder("b").openFolder("c");
folder.getAbsolutePath(); // -> /a/b/c

C#:
DpFolder folder = dataPackage.openRootFolder().openFolder("a").openFolder("b").openFolder("c");
folder.getAbsolutePath(); // -> /a/b/c
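Access by absolute path and access by browsing lead to the same folder. The following minimal sketch shows why: resolving a path amounts to opening one path segment after another, starting at the root. The class PathWalkSketch and its nested Node type are hypothetical and only model the behavior; they are not part of the ADF API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class PathWalkSketch {
    // Hypothetical folder node; the real API type is DpFolder.
    static final class Node {
        final String name;
        final Node parent;
        final Map<String, Node> children = new HashMap<>();

        Node(String name, Node parent) {
            this.name = name;
            this.parent = parent;
        }

        Node add(String childName) {
            Node child = new Node(childName, this);
            children.put(childName, child);
            return child;
        }

        // Mirrors browsing one level down; throws if the child is missing.
        Node open(String childName) {
            Node child = children.get(childName);
            if (child == null) {
                throw new NoSuchElementException("No such folder: " + childName);
            }
            return child;
        }

        // Rebuilds the absolute path by walking up to the root.
        String absolutePath() {
            if (parent == null) {
                return "/";
            }
            String prefix = parent.absolutePath();
            return prefix.endsWith("/") ? prefix + name : prefix + "/" + name;
        }
    }

    // Resolves an absolute path such as "/a/b/c" one segment at a time.
    static Node resolve(Node root, String absolutePath) {
        if (!absolutePath.startsWith("/")) {
            throw new IllegalArgumentException("Path must be absolute: " + absolutePath);
        }
        Node current = root;
        for (String segment : absolutePath.split("/")) {
            if (!segment.isEmpty()) {
                current = current.open(segment);
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        root.add("a").add("b").add("c");
        Node byPath = resolve(root, "/a/b/c");
        Node byBrowsing = root.open("a").open("b").open("c");
        System.out.println(byPath == byBrowsing);   // prints "true"
        System.out.println(byPath.absolutePath());  // prints "/a/b/c"
    }
}
```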
				

Listing the Content of a Folder

The ADF-DP API provides a method to list the contents of a folder in a given Data Package. The method returns an empty list if the folder does not have any contents.

JAVA:
DpFolder rootFolder = dataPackage.openRootFolder();

List<DpNode> contents = rootFolder.contents();
for (DpNode content : contents)
{
	if (content instanceof DpFolder) {
		// print out the inherent meta data of the folder, e.g. content.getName(), content.getCreationOn(), ...
	} else if (content instanceof DpFile) {
		// print out the inherent meta data of the file, e.g. content.getName(), content.getSize(), ...
	}
}

C#:
DpFolder rootFolder = dataPackage.openRootFolder();

java.util.List contents = rootFolder.contents();
foreach (DpNode content in contents)
{
	if (content is DpFolder) {
		// print out the inherent meta data of the folder, e.g. content.getName(), content.getCreationOn(), ...
	} else if (content is DpFile) {
		// print out the inherent meta data of the file, e.g. content.getName(), content.getSize(), ...
	}
}
				

The meta data loaded for files and folders is based on the requirements for accessing files and accessing folders specified in [[!ADF-DP]].

Renaming a Folder

The ADF-DP API provides a method to rename a folder in a given Data Package:

JAVA and C#:
// Renames the folder "oldName" to "newName"
theFolder.renameTo("newName");
				

Deleting a Folder

The ADF-DP API provides a method to delete a folder in a given Data Package:

JAVA:
// Deletes the folder "toBeDeleted"
dataPackage.deleteFolder("/helloWorld/toBeDeleted");

C#:
// Deletes the folder "toBeDeleted"
dataPackage.deleteFolder("/helloWorld/toBeDeleted");
				

File Operations

The following sections describe the operations that can be carried out on files of the Data Package with the Data Package API.

Creating a File

The following example demonstrates how a file is created:

JAVA:
// Retrieves the folder "helloWorld"
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");

// Creates the file "Hello World.txt" inside the "helloWorld" folder
DpFile file = helloWorld.createFile("Hello World.txt");

C#:
// Retrieves the folder "helloWorld"
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");

// Creates the file "Hello World.txt" inside the "helloWorld" folder
DpFile file = helloWorld.createFile("Hello World.txt");
				

When a file is created, a UUID URN is generated as its URI (the file-URI), and the classes of the file (adf-dp:File and ldp:Resource) are stated in the ADF Triple Store:

$fileURI rdf:type adf-dp:File, ldp:Resource.
				
<urn:uuid:fde37fc2-296a-4bbf-8f7c-219d50de2dc1>
	a      		ldp:Resource, adf-dp:File ;
	dc:title 	"fileName"^^xsd:string ;
	adf-dp:representedBy <hdf:/data-package/fileName> .

<hdf:/data-package/fileName>
	a      		hdf:Dataset ;
	hdf:name	"fileName"^^xsd:string .
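The shape of such a file-URI can be reproduced with the standard Java class java.util.UUID. The following is a minimal sketch. It assumes a random (version 4) UUID, which this guide does not specify, and the class FileUriSketch is hypothetical and not part of the ADF API.

```java
import java.util.UUID;

// Hypothetical helper that shows the shape of a UUID URN (not part of the ADF API).
class FileUriSketch {
    // Builds a UUID URN; a random (version 4) UUID is an assumption of this sketch.
    static String newFileUri() {
        return "urn:uuid:" + UUID.randomUUID();
    }

    // Renders the type statement for the given URI in Turtle syntax.
    static String typeStatement(String fileUri) {
        return "<" + fileUri + "> a ldp:Resource, adf-dp:File .";
    }

    public static void main(String[] args) {
        // Prints one Turtle statement; the UUID differs on every run.
        System.out.println(typeStatement(newFileUri()));
    }
}
```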

Accessing a File

The ADF-DP API provides a method to access a given file in a Data Package. The file MAY be given by URI, by absolute path or by relative path (as shown in the following examples). The method takes the path or the URI of the file as argument and returns a reference to this file if it exists.

The method throws an exception if the file given by URI or by path does not exist.

The file returned by this method fulfils the requirements specified in [[!ADF-DP]].

Accessing a File directly

JAVA:
// Retrieves the file "Hello World.txt" directly. The path must be absolute.
DpFile file1 = dataPackage.openFile("/helloWorld/Hello World.txt");

C#:
// Retrieves the file "Hello World.txt" directly. The path must be absolute.
DpFile file1 = dataPackage.openFile("/helloWorld/Hello World.txt");

Accessing a File from parent folder

JAVA:
// Retrieves the folder "helloWorld"
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");
DpFile file1 = helloWorld.openFile("Hello World.txt");

C#:
// Retrieves the folder "helloWorld"
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");
DpFile file1 = helloWorld.openFile("Hello World.txt");

Reading a File

The ADF-DP API provides methods to read a given file in a Data Package either as byte content or as an input stream.

Reading a File as byte content

The following example shows how to read the byte content directly from a file:

JAVA:
// Retrieves the file "Hello World.txt" directly.
DpFile file = dataPackage.openFile("/helloWorld/Hello World.txt");

// Reads the content of the file.
byte[] bytes = file.read();
String text = new String(bytes);

C#:
// Retrieves the file "Hello World.txt" directly.
DpFile file = dataPackage.openFile("/helloWorld/Hello World.txt");

// Reads the content of the file.
sbyte[] bytes = file.read();
String text = esme.util.StringUtil.create(bytes);
				

Reading a File as input stream

The following example shows how to read from a file through an input stream:

JAVA:
DpFile file = dataPackage.openFile("/helloWorld/streamedFile.bin");

// Creates an input stream to read from the file.
try (InputStream is1 = new DpInputStream(file))
{
	// Reads the file byte by byte until the end of the stream is reached.
	int int1;
	while ((int1 = is1.read()) != -1)
	{
		byte byte1 = (byte) int1;
		log.info("" + byte1);
	}
}

C#:
DpFile file = dataPackage.openFile("/helloWorld/streamedFile.bin");

// Creates an input stream to read from the file.
DpInputStream is1 = new DpInputStream(file);
try
{
	int int1;
	while ((int1 = is1.read()) != -1)
	{
		sbyte byte1 = (sbyte) int1;
		Log.info("" + byte1);
	}
}
finally
{
	is1.close();
}
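In the Java example above the stream is used through the standard java.io.InputStream type, so it can also be read in chunks rather than one byte per call. The following minimal sketch shows such a chunked read. A ByteArrayInputStream stands in for the stream that wraps a DpFile, the class BufferedReadSketch is hypothetical, and the buffer size of 8192 bytes is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BufferedReadSketch {
    // Drains any InputStream in chunks instead of one byte per call.
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int count;
        try {
            while ((count = in.read(buffer)) != -1) {
                collected.write(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = "Hello World".getBytes(StandardCharsets.UTF_8);
        // ByteArrayInputStream is a stand-in for a stream that wraps a DpFile.
        byte[] read = readAll(new ByteArrayInputStream(data));
        System.out.println(new String(read, StandardCharsets.UTF_8)); // prints "Hello World"
    }
}
```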
			

Updating a File

The ADF-DP API provides a method to write bytes into a given file within a Data Package. When a file is updated, the meta data requirements of the class adf-dp:File MUST be fulfilled at modification time, and the meta data MUST be updated in the ADF Triple Store (ADF-TS) as specified in [[!ADF-DP]].

The detailed specification for writing into a file is given in [[!ADF-DP]].

Overwriting a File

JAVA:
// Retrieves the file "Hello World.txt" directly.
DpFile file = dataPackage.openFile("/helloWorld/Hello World.txt");

// Writes new content to the file. Default is overwriting.
file.write("2, 3, 5, 7 and 11".getBytes());

// -> "2, 3, 5, 7 and 11"

C#:
// Retrieves the file "Hello World.txt" directly.
DpFile file = dataPackage.openFile("/helloWorld/Hello World.txt");

// Writes new content to the file. Default is overwriting.
file.write(esme.util.StringUtil.getBytes("2, 3, 5, 7 and 11"));

// -> "2, 3, 5, 7 and 11"
				

Appending to a File

JAVA:
// continued from previous example
file.write(" are prime numbers".getBytes(), OpenOption.APPEND);

// -> "2, 3, 5, 7 and 11 are prime numbers"

C#:
// continued from previous example
file.write(esme.util.StringUtil.getBytes(" are prime numbers"), OpenOption.APPEND);

// -> "2, 3, 5, 7 and 11 are prime numbers"
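The difference between overwriting and appending can be tried out without the ADF library. The following minimal sketch uses the standard java.nio.file API on a temporary file as an analogy. It does not use DpFile, the class WriteModesSketch is hypothetical, and StandardOpenOption.APPEND is the Java NIO option, not the ADF OpenOption shown above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class WriteModesSketch {
    // Overwrites a temporary file, then appends to it, and returns the resulting content.
    static String overwriteThenAppend() {
        try {
            Path file = Files.createTempFile("hello-world", ".txt");
            try {
                Files.write(file, "old content".getBytes(StandardCharsets.UTF_8));
                // Without explicit options, Files.write truncates an existing file.
                Files.write(file, "2, 3, 5, 7 and 11".getBytes(StandardCharsets.UTF_8));
                // APPEND keeps the existing bytes and writes after them.
                Files.write(file, " are prime numbers".getBytes(StandardCharsets.UTF_8),
                        StandardOpenOption.APPEND);
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(overwriteThenAppend()); // prints "2, 3, 5, 7 and 11 are prime numbers"
    }
}
```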
				

Writing to a File by stream

JAVA:
// Retrieves the folder "helloWorld" directly.
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");

// Creates a new file to stream to.
DpFile file = helloWorld.createFile("streamedFile.bin");

int numberToWrite = 0;

// Creates an output stream to write to the new file.
try (OutputStream os = new DpOutputStream(file))
{
	while (numberToWrite < 1000)
	{
		os.write(numberToWrite++);
	}
}

C#:
// Retrieves the folder "helloWorld" directly.
DpFolder helloWorld = dataPackage.openFolder("/helloWorld");

// Creates a new file to stream to.
DpFile file = helloWorld.createFile("streamedFile.bin");

int numberToWrite = 0;

// Creates an output stream to write to the new file.
java.io.OutputStream os = new DpOutputStream(file);
try
{
	while (numberToWrite < 1000)
	{
		os.write(numberToWrite++);
	}
}
finally
{
	os.close();
}
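One detail of the streaming example is worth noting: the standard java.io.OutputStream.write(int) contract stores only the low-order 8 bits of its argument, so counter values above 255 wrap around. The following minimal sketch demonstrates this with a ByteArrayOutputStream as a stand-in. That DpOutputStream follows the same contract is an assumption based on its use as an OutputStream in the Java example, and the class LowByteSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;

class LowByteSketch {
    // Writes the integers 0..999 with write(int) and returns the stored bytes.
    static byte[] writeCounter() {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        for (int number = 0; number < 1000; number++) {
            os.write(number); // only the low-order 8 bits are stored
        }
        return os.toByteArray();
    }

    public static void main(String[] args) {
        byte[] stored = writeCounter();
        System.out.println(stored.length);       // prints "1000"
        System.out.println(stored[255] & 0xFF);  // prints "255"
        System.out.println(stored[256] & 0xFF);  // prints "0"
    }
}
```

To store the full integer values, the bytes would have to be encoded explicitly, for example by wrapping the stream in a java.io.DataOutputStream and calling writeInt.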
				

Renaming a File

The ADF-DP API provides a method to rename a File in a given Data Package:

JAVA and C#:
// Renames the file "oldName.txt" to "newName.txt"
theFile.renameTo("newName.txt");
				

Deleting a File

The ADF-DP API provides a method to delete a file in a given Data Package:

JAVA:
// Deletes the file "toBeDeleted.txt"
dataPackage.deleteFile("/helloWorld/toBeDeleted.txt");

C#:
// Deletes the file "toBeDeleted.txt"
dataPackage.deleteFile("/helloWorld/toBeDeleted.txt");
				

Import and export files and folders

The ADF-DP API provides methods to import files and folders from the local file system into an ADF file and to export them back to the file system.

Import

The import functionality imports either a single file or the whole content of a folder into the ADF file. It creates DpFolder and DpFile objects that mirror the hierarchy of the imported content and preserves the meta-data of the original files and folders. Symbolic links are followed by default. This behavior can be turned off with org.allotrope.adf.enums.LinkOption.NOFOLLOW_LINKS.

By default, files are imported as normal DpFiles with a block size of 8000 bytes. These files can be changed and renamed like all other DpFiles. If the import option org.allotrope.adf.enums.ImportOption.FIXED_SIZE is specified, the DpFiles are stored as read-only files that occupy only their actual size inside the ADF Data Package.

JAVA and C#:
// import contents of system folder into ADF folder "dpFolder"
dataPackage.importIntoDpFolder(importDirectoryPath, dpFolder);

// it is also possible to just import a single file into the ADF DpFolder
dataPackage.importIntoDpFolder(anyFilePath, dpFolder);

// Symbolic links are automatically followed. If this behavior is not wanted, it can be turned off by specifying LinkOption.NOFOLLOW_LINKS
dataPackage.importIntoDpFolder(anyFilePath, dpFolder, LinkOption.NOFOLLOW_LINKS);

// Files are stored as normal DpFiles. If the size of the ADF file is important and the files will not be changed inside the ADF file, this
// can be changed by specifying ImportOption.FIXED_SIZE
dataPackage.importIntoDpFolder(anyFilePath, dpFolder, ImportOption.FIXED_SIZE);

// LinkOption.NOFOLLOW_LINKS and ImportOption.FIXED_SIZE may be used at the same time
dataPackage.importIntoDpFolder(anyFilePath, dpFolder, LinkOption.NOFOLLOW_LINKS, ImportOption.FIXED_SIZE);
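Before importing a large directory it can be useful to preview the hierarchy that the import would mirror. The following minimal sketch lists the relative paths below a source folder with the standard java.nio.file API. It does not call the ADF API, the class ImportPreviewSketch is hypothetical, and FileVisitOption.FOLLOW_LINKS is the Java NIO option used here only as an analogy for the link handling described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitOption;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ImportPreviewSketch {
    // Lists every entry below sourceRoot as a '/'-separated relative path.
    static List<String> preview(Path sourceRoot, boolean followLinks) {
        FileVisitOption[] options = followLinks
                ? new FileVisitOption[] { FileVisitOption.FOLLOW_LINKS }
                : new FileVisitOption[0];
        try (Stream<Path> walk = Files.walk(sourceRoot, options)) {
            return walk.filter(path -> !path.equals(sourceRoot))
                    .map(path -> sourceRoot.relativize(path).toString().replace('\\', '/'))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small temporary tree, previews it, and removes it again.
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("import-preview");
            try {
                Path folder = Files.createDirectories(root.resolve("a").resolve("b"));
                Files.write(folder.resolve("data.txt"), new byte[] { 1, 2, 3 });
                return preview(root, true);
            } finally {
                // Deletes children before parents by visiting paths in reverse order.
                try (Stream<Path> walk = Files.walk(root)) {
                    walk.sorted(Comparator.reverseOrder()).forEach(path -> path.toFile().delete());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[a, a/b, a/b/data.txt]"
    }
}
```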

		

Export

The export functionality exports the content of the ADF DpFolder or just a DpFile according to their hierarchy in the ADF to the specified system folder. Preserved meta-data is automatically restored.

JAVA and C#:
// export contents of dpFolder into system folder "exportDirectory"
dataPackage.exportDpFolder(dpFolder, exportDirectory);

// it is also possible to just export a single DpFile
dataPackage.exportDpFile(dpFile, exportDirectory);

		

Revisions of files

Changes made to a DpFile after the AuditTrail of an ADF file has been activated are available as revisions. The initial version of a file has the revision number 1, the first changed version is revision 2, and so on. For unchanged files, or if the AuditTrail is not active, only one revision (with the revision number 1) is available.

There are two methods on a DpFile to work with the revisions:

JAVA and C#:
// retrieve the revision number
int revisionNumber = dpFile.getRevisionNumber(); // returns e.g. 2

// retrieve the previous version
DpFile previousRevision = dpFile.getPreviousRevision(); // in this example revision 1

int olderRevisionNumber = previousRevision.getRevisionNumber(); // returns 1
DpFile olderPreviousRevision = previousRevision.getPreviousRevision(); // returns null
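The two methods are sufficient to walk the complete history of a file: follow the previous revision until null is returned. The following minimal sketch models such a revision chain in memory. The class RevisionSketch is a hypothetical stand-in for DpFile, and its methods initial, change and history are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a revision chain; the real API type is DpFile.
class RevisionSketch {
    private final int revisionNumber;
    private final RevisionSketch previous;

    private RevisionSketch(int revisionNumber, RevisionSketch previous) {
        this.revisionNumber = revisionNumber;
        this.previous = previous;
    }

    // The initial version of a file is revision 1 and has no predecessor.
    static RevisionSketch initial() {
        return new RevisionSketch(1, null);
    }

    // Each change produces the next revision, linked to the one before it.
    RevisionSketch change() {
        return new RevisionSketch(revisionNumber + 1, this);
    }

    int getRevisionNumber() {
        return revisionNumber;
    }

    RevisionSketch getPreviousRevision() {
        return previous;
    }

    // Collects the revision numbers from this revision back to revision 1.
    List<Integer> history() {
        List<Integer> numbers = new ArrayList<>();
        for (RevisionSketch r = this; r != null; r = r.getPreviousRevision()) {
            numbers.add(r.getRevisionNumber());
        }
        return numbers;
    }

    public static void main(String[] args) {
        RevisionSketch current = initial().change().change();
        System.out.println(current.history()); // prints "[3, 2, 1]"
    }
}
```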

		

The DataPackage.openFile() method has an overload that accepts the revision number as its second parameter. This makes it possible to open a specific revision of a DpFile directly. If the revision does not exist, a FileNotFoundException is thrown.

JAVA and C#:
// retrieve revision 3 of a DpFile
DpFile file = dataPackage.openFile("/TestFile.txt", 3);

// retrieve a non-existing revision (example DpFile has only 3 revisions)
DpFile missingRevision = dataPackage.openFile("/TestFile.txt", 4); // throws a FileNotFoundException

		

Complete Example

The ADF-DP First Steps Example Application illustrates the Java API of ADF-DP by one complete code example. It is contained in the file FirstSteps.java in the package org.allotrope.adf.dp.firststeps.

Change History

Version Release Date Remarks
0.4.0 2015-06-29
  • Initial Working Draft version
1.0.0 RC 2015-09-17
  • Renamed document from User Manual to Developer's Guide
  • Renamed section Example Code to Complete Example
  • Removed unnecessary sub section from section Complete Example
  • Added provenance information to the Complete Example
1.0.0 2015-09-29
  • Updated versions, dates and document status
  • Removed code from section Complete Example
1.1.0 RC 2016-03-11
  • Updated versions, dates and document status
  • Added section on number formatting to document conventions
  • Added information and examples for C#/.NET
1.1.0 RF 2016-03-31
  • Updated versions, dates and document status
1.1.5 2016-05-13
  • Updated versions and dates
1.2.0 Preview 2016-09-23
  • Updated versions and dates
  • Corrected the data type of the size of a DpFile from int to long
1.3.0 Preview 2017-03-31
  • Updated versions and dates
1.3.0 RF 2017-06-30
  • Updated versions and dates, authors
  • Added section on Importing and exporting files and folders
1.4.3 RC 2018-10-11
  • Added information about DpFile revisions.
  • Added information about imported files with a fixed size.
  • Added information about renaming DpFiles and DpFolders.
  • Updated versions and dates
1.4.5 RF 2018-12-17
  • Updated versions and dates
1.5.0 RC 2019-12-12
  • Updated versions and dates
1.5.3 RF 2020-11-30
  • Updated broken reference links
  • Updated PURL and DOCS server links to relative links
  • Reformatted the document header